Method for in vitro diagnosis or prognosis of testicular cancer

ABSTRACT

The invention relates to a method for in vitro diagnosis or prognosis of testicular cancer which comprises a step of detecting the presence or absence of at least one expression product from at least one nucleic acid sequence selected from the sequences identified in SEQ ID NOS: 1 to 6 or from the sequences which exhibit at least 99% identity with one of the sequences identified in SEQ ID NOS: 1 to 6, to isolated nucleic acid sequences and to the use thereof as a testicular cancer marker.

Testicular cancer represents 1 to 2% of cancers in men, and 3.5% of urological tumors. It is the most common tumor in young men, and rare before 15 years of age and after 50 years of age. The risk is highest in patients who are seropositive for HIV. Seminoma is the most common form of testicular cancer (40%), but many other types of cancer exist, among which are embryonic carcinoma (20%), teratocarcinoma (30%) and choriocarcinoma (1%).

The diagnosis of testicular cancer is first clinical: it often presents in the form of a hard and irregular swelling of the testicle. An ultrasound confirms the intratesticular tumor and Doppler ultrasound demonstrates the increase in vascularization in the tumor. In some cases, a magnetic resonance examination (testicular MRI) can be useful. A thoracic, abdominal and pelvic scan makes it possible to investigate whether there is any lymph node involvement of the cancer. A blood sample for assaying tumor markers is virtually systematic. It makes it possible to orient the diagnosis of the type of tumor. Two main tumor markers are used and assayed in the blood: β-HCG and α-foetoprotein. However, these markers are not very specific and, furthermore, if the concentration of these markers is at physiological levels, this does not mean that there is an absence of tumor. At the current time, the final diagnosis and final prognosis are given after ablation of the affected testicle (orchidectomy), which constitutes the first stage of treatment. Next, depending on the type of cancer and on its stage, a complementary treatment by radiotherapy or chemotherapy is applied. There is therefore a real need for having markers which are specific for testicular cancer and which, in addition, make it possible to establish as early a diagnosis and prognosis as possible.

The rare event represented by the infection of a germline cell by an exogenous retrovirus results in the integration, into the host's genome, of a proviral DNA or provirus, which becomes an integral part of the genetic inheritance of the host. This endogenous provirus (HERV) is therefore transmissible to the next generation in Mendelian fashion. It is estimated that there are approximately one hundred HERV families, representing approximately 8% of the human genome. Each of the families has from several tens to thousands of loci, which are the result of intracellular retrotranspositions of transcriptionally active copies. The loci of the contemporary HERV families are all replication-defective, which signifies loss of the infectious properties and therefore implies an exclusively vertical (Mendelian) transmission mode.

HERV expression has been particularly studied in three specific contexts, placentation, autoimmunity and cancer, which are associated with cell differentiation or with the modulation of immunity. It has thus been shown that the envelope glycoprotein of the ERVWE1 locus of the HERV-W family is involved in the fusion process resulting in syncytiotrophoblast formation. It has, moreover, been suggested that the Rec protein, which is a splice variant of the env gene of HERV-K, could be involved in the testicular tumorigenesis process. However, the following question has not yet been answered: are HERVs players or markers in pathological contexts?

The present inventors have now discovered and demonstrated that nucleic acid sequences belonging to loci of the HERV-W family are associated with testicular cancer and that these sequences are molecular markers for the pathological condition. The sequences identified are either proviruses, i.e. sequences containing all or part of the gag, pol and env genes flanked on the 5′ and on the 3′ by long terminal repeats (LTRs), or isolated LTRs. The DNA sequences identified are respectively referenced as SEQ ID Nos. 1 to 6 in the sequence listing.

The subject of the present invention is therefore a method for in vitro diagnosis or prognosis of testicular cancer, in a biological sample from a patient suspected of suffering from testicular cancer, which comprises a step of detecting at least one expression product from at least one nucleic acid sequence of the endogenous retroviral family called HERV-W, said sequence being selected from the sequences identified in SEQ ID Nos. 1 to 6 or from the sequences which exhibit at least 99% identity, preferably at least 99.5% identity, and advantageously at least 99.6% identity, with one of the sequences identified in SEQ ID Nos. 1 to 6.

The percentage identity described above has been determined while taking into consideration the nucleotide diversity in the genome. It is known that nucleotide diversity is higher in the regions of the genome that are rich in repeat sequences than in the regions which do not contain repeat sequences. By way of example, D. A. Nickerson et al. [1] have shown a diversity of approximately 0.3% (0.32%) in regions containing repeat sequences.
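By way of illustration only, the identity thresholds mentioned above can be checked with a minimal Python sketch such as the following. It assumes that the two sequences have already been aligned (equal length, with '-' marking gaps) and that any column containing a gap counts as a mismatch; neither the alignment method nor the treatment of gaps is specified in the present description, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns carrying identical nucleotides.

    Assumption: both inputs are pre-aligned (same length, '-' for gaps)
    and a column containing a gap is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 99.0) -> bool:
    """True when the identity is at least `threshold` percent (99% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

The stricter thresholds of 99.5% and 99.6% can be tested by passing them as the `threshold` argument.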

The expression product which is detected is preferably at least one mRNA transcript of at least one of the sequences SEQ ID Nos. 1 to 6, but this can also be a polypeptide which is the product of translation of at least one of said transcripts.

When the expression product is an mRNA transcript, it is detected by any suitable method, such as hybridization, sequencing or amplification. The mRNA can be detected directly by bringing it into contact with at least one probe and/or at least one primer which are designed so as to hybridize, under predetermined stringency conditions, to the mRNA transcripts, demonstrating the presence or absence of hybridization to the mRNA and, optionally, quantifying the mRNA. Among the preferred methods, mention may be made of amplification (for example, RT-PCR, NASBA, etc.) or else Northern blotting. The mRNA can also be detected indirectly on the basis of nucleic acids derived from said transcripts, such as cDNA copies, etc.

Generally, the method of the invention comprises an initial step of extracting the mRNA from the sample to be analyzed.

First, the method can comprise:

-   (i) a step of extracting the mRNA from the sample to be analyzed,
-   (ii) a step of detecting and quantifying the mRNA of the sample to be analyzed,
-   (iii) a step of extracting the mRNA from a healthy sample,
-   (iv) a step of detecting and quantifying the mRNA of the healthy sample,
-   (v) a step of comparing the amount of mRNA expressed in the sample to be analyzed and in the healthy sample; if the amount of mRNA expressed in the sample to be analyzed is determined as being greater than the amount of mRNA expressed in the healthy sample, this can be correlated with the diagnosis or prognosis of a testicular cancer;

and in particular:

-   (i) extraction of the RNA to be analyzed from the sample,
-   (ii) determination, in the RNA to be analyzed, of a level of expression of at least one RNA sequence in the sample, said RNA sequence being the product of transcription of at least one nucleic acid sequence selected from the sequences identified in SEQ ID Nos. 1 to 6 or from the sequences which exhibit at least 99% identity, preferably at least 99.5% identity, and advantageously at least 99.6% identity, with one of the sequences identified in SEQ ID Nos. 1 to 6, and
-   (iii) comparison of the level of expression of said RNA sequence(s) defined in (ii) with the level of expression of said RNA sequence(s) in a noncancerous biological sample; if the level of expression of the RNA to be analyzed is determined as being greater than the level of expression of the RNA extracted from the noncancerous biological sample, this can be correlated with the diagnosis or prognosis of a testicular cancer.

The transcripts are overexpressed in testicular tumors. In order to detect such an overexpression, a reference point may be necessary, i.e. a control. The amount of mRNA in the healthy sample serves as a reference standard to which the amount of mRNA in the sample to be analyzed can be compared, it being possible for an overexpression of mRNA in the sample to be analyzed, compared with the expression of mRNA in the healthy sample, to be correlated with a diagnosis or prognosis of a testicular cancer. However, since transcription is generally negligible or even nonexistent in the healthy sample, whereas it is significantly higher in the cancer sample, a reference point is not essential, the significant expression of transcripts being an indicator of the disease.

The term “overexpressed sequence” is intended to mean an mRNA sequence which is found in greater amounts or at higher levels than those found for the same mRNA sequence derived from the same type of sample, but which is noncancerous, constituting the reference threshold value.
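By way of illustration only, the comparison between the sample to be analyzed and the healthy reference can be sketched as follows in Python. The function names and the `min_fold` cut-off are hypothetical: the present description does not fix any numerical threshold. Because transcription may be nonexistent in the healthy sample, a zero reference level is handled explicitly (treating "no expression in either sample" as a ratio of 1 is a choice made for this sketch).

```python
import math


def fold_change(test_level: float, reference_level: float) -> float:
    """Ratio of expression in the test sample to the healthy reference.

    Returns math.inf when the reference shows no detectable expression
    but the test sample does, and 1.0 when neither shows expression.
    """
    if test_level < 0 or reference_level < 0:
        raise ValueError("expression levels must be non-negative")
    if reference_level == 0:
        return math.inf if test_level > 0 else 1.0
    return test_level / reference_level


def is_overexpressed(test_level: float, reference_level: float,
                     min_fold: float = 2.0) -> bool:
    """True when the test sample exceeds the reference by `min_fold`.

    `min_fold` is a hypothetical cut-off chosen for illustration only.
    """
    return fold_change(test_level, reference_level) >= min_fold
```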

The sequences of said transcripts are respectively identified in SEQ ID Nos. 7 to 12 (given with reference to the genomic DNA):

SEQ ID No. 7=transcript of the HW4TT locus,

SEQ ID No. 8=transcript of the HW2TT locus,

SEQ ID No. 9=transcript of the HW13TT locus,

SEQ ID No. 10=transcript of the HWXTT locus,

SEQ ID No. 11=transcript of the HW21TT locus,

SEQ ID No. 12=transcript of the ERVWE1 locus.

When the expression product is a polypeptide derived from the translation of at least one of the transcripts, it can be detected, in the method of the invention, using at least one binding partner specific for said polypeptide, in particular an antibody, for example a monoclonal antibody. The method for producing monoclonal antibodies and the selection process are well known to those skilled in the art.

By way of illustration, polypeptide sequences are described and identified in SEQ ID Nos. 14, 16, 18, 20, 22, 24 and 26:

SEQ ID No. 14=Gag protein of HW4TT,

SEQ ID No. 16=protease of HW4TT,

SEQ ID No. 18=Gag protein of HW2TT,

SEQ ID No. 20=protein of HW2TT,

SEQ ID No. 22=Gag protein of HW13TT,

SEQ ID No. 24=Gag protein of HW21TT,

SEQ ID No. 26=Env protein of ERVWE1 (Syncytin-1).

The sample from the patient will generally comprise cells (such as the testicular cells). They may be present in a tissue sample (such as the testicular tissue) or be found in the circulation. In general, the sample is a testicular tissue extract or a biological fluid, such as blood, serum, plasma, urine or else seminal fluid.

The subject of the invention is also an isolated nucleic acid sequence which consists of:

-   (i) at least one DNA sequence selected from the sequences SEQ ID Nos. 1 to 6, or
-   (ii) at least one DNA sequence complementary to a sequence selected from the sequences SEQ ID Nos. 1 to 6, or
-   (iii) at least one DNA sequence which exhibits at least 99% identity, preferably at least 99.5% identity, and advantageously at least 99.6% identity, with a sequence as defined in (i) and (ii), or
-   (iv) at least one RNA sequence which is the product of transcription of a sequence selected from the sequences as defined in (i), or
-   (v) at least one RNA sequence which is the product of transcription of a sequence selected from the sequences which exhibit at least 99% identity, preferably at least 99.5% identity, and advantageously at least 99.6% identity, with a sequence as defined in (i), or
-   (vi) at least one RNA sequence selected from the sequences SEQ ID Nos. 7 to 12; and

the use of at least one isolated nucleic acid sequence, as a molecular marker for in vitro diagnosis or prognosis of testicular cancer, in which the nucleic acid sequence consists of:

-   (i) at least one DNA sequence selected from the sequences SEQ ID Nos. 1 to 6, or
-   (ii) at least one DNA sequence complementary to a sequence selected from the sequences SEQ ID Nos. 1 to 6, or
-   (iii) at least one DNA sequence which exhibits at least 99% identity with a sequence as defined in (i) and (ii), or
-   (iv) at least one RNA sequence which is the product of transcription of a sequence selected from the sequences as defined in (i), or
-   (v) at least one RNA sequence which is the product of transcription of a sequence selected from the sequences which exhibit at least 99% identity with a sequence as defined in (i), or
-   (vi) at least one RNA sequence selected from the sequences SEQ ID Nos. 7 to 12.

FIGURES

FIG. 1 represents the principle of the WTA method for amplifying RNAs.

FIG. 2 represents a synoptic scheme of the nature and the sequence of the various steps for preprocessing DNA-chip data according to the RMA method.

FIG. 3 illustrates the nomenclature, the position and the structure of the HERV-W loci overexpressed and exhibiting a loss of methylation in the tumoral testicle.

FIG. 4 is a histogram representing the increase in expression of the five loci (HW4TT, HW2TT, HW13TT, HWXTT and HW21TT), respectively, in three pairs of testicular samples (testicle 1, testicle 2 and testicle 3), based on a comparative tumor sample/healthy sample quantification. The loci are represented along the x-axis and the factors of increase of expression between tumor tissue and healthy tissue are represented along the y-axis.

FIGS. 5 to 10 represent the methylation status of the U3 region of unique LTR or of the 5′ LTR of the various loci, respectively HW4TT, HW2TT, HW13TT, HWXTT, HW21TT and ERVWE1 in the healthy testicle (normal) and in the tumoral testicle derived from the same patient, after amplification and analysis of the sequences obtained.

EXAMPLES

Example 1 Identification of HERV-W Loci Expressed in Cancerous Tissues

Method:

The identification of expressed HERV-W loci is based on the design of a high-density DNA chip in the GeneChip format proposed by the company Affymetrix. It is a specially developed, custom-made chip, the probes of which correspond to HERV-W loci. The sequences of the HERV-W family were identified from the GenBank nucleic databank using the Blast algorithm (Altschul et al., 1990) with the sequence of the ERVWE1 locus, located on chromosome 7 at 7q21.2 and encoding the protein called syncytin. The sequences homologous to HERV-W were compared to a library containing reference sequences of the HERV-W family (ERVWE1) cut up into functional regions (LTR, gag, pol and env), using the RepeatMasker software (A. F. A. Smit and P. Green). These elements constitute the HERVgDB bank.

The probes making up the high-density chip were defined on a criterion of uniqueness of their sequences in the HERVgDB bank. The HERV-W proviral and solitary LTRs contained in the HERVgDB bank were extracted. Each of these sequences was broken down into a set of sequences of 25 nucleotides (25-mers) constituting it, i.e. as many potential probes. The evaluation of the uniqueness of each probe was carried out by means of a similarity search with all the 25-mers generated for all the LTRs of the family under consideration. This made it possible to identify all the 25-mers of unique occurrence for each family of HERV. Next, some of these 25-mers were retained as probes. For each U3 or U5 target region, a set of probes was formed on the basis of the probes identified as unique.
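By way of illustration only, the selection of 25-mers of unique occurrence can be sketched as follows in Python. This sketch simplifies the procedure described above: uniqueness is assessed by exact matching rather than by a similarity search, only the given strand is considered, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List

K = 25  # probe length stated in the text


def kmers(seq: str, k: int = K) -> List[str]:
    """All overlapping k-mers of a sequence, in order of position."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def unique_probes(sequences: Dict[str, str], k: int = K) -> Dict[str, List[str]]:
    """Map each sequence name to its k-mers occurring exactly once overall.

    A k-mer repeated within a single sequence, or shared between two
    sequences, is discarded (exact-match criterion, an assumption).
    """
    counts = Counter()
    for seq in sequences.values():
        counts.update(kmers(seq, k))
    return {
        name: [km for km in kmers(seq, k) if counts[km] == 1]
        for name, seq in sequences.items()
    }
```

A subset of the k-mers returned for each U3 or U5 region would then be retained as the probe set for that region.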

The samples analyzed using the HERV high-density chip correspond to RNAs extracted from tumors and to RNAs extracted from the healthy tissues adjacent to these tumors. The tissues analyzed are: uterus, colon, lung, breast, testicle, prostate and ovary. Placental RNAs (healthy tissue only) were also analyzed. For each sample, 400 ng of total RNA were amplified by means of an unbiased transcriptional method known as WTA. The principle of WTA amplification is the following: primers (RP-T7) comprising a random sequence and a T7 promoter sequence are hybridized to the transcripts; double-stranded cDNAs are synthesized and serve as a template for transcriptional amplification by the T7 RNA polymerase; the antisense RNAs generated are converted to double-stranded cDNAs which are then fragmented and labeled by introducing biotinylated nucleotide analogs at the 3′OH ends using terminal transferase (TdT) (cf. FIG. 1).

For each sample, 16 μg of biotin-labeled amplification products were hybridized to a DNA chip according to the protocol recommended by the company Affymetrix. The chips were then washed and labeled, according to the recommended protocol. Finally, the chips were read by a scanner in order to acquire the image of their fluorescence. The image analysis carried out using the GCOS software makes it possible to obtain numerical values of fluorescence intensity which are preprocessed according to the RMA method (cf.: FIG. 2) before being able to carry out a statistical analysis in order to identify the HERV loci specifically expressed in certain samples.

Comparison of the means of more than two classes of samples was carried out by the SAM procedure applied to a Fisher test.

Results:

The processing of the data generated by the analysis on DNA chip using this method made it possible to identify six sets of probes corresponding to an overexpression in just one sample: the tumoral testicle. These six sets of probes are specific for six precise loci referenced HW4TT, HW2TT, HW13TT, HWXTT, HW21TT and ERVWE1 (cf.: FIG. 3). These six loci therefore represent markers for testicular cancer. Their nucleotide sequences are respectively identified in SEQ ID Nos. 1 to 6 in the sequence listing and the nucleotide sequences of their respective transcripts are identified in SEQ ID Nos. 7 to 12 in the sequence listing.

The information relating to the abovementioned six loci is summarized in Table 1 below.

TABLE 1

Locus    SEQ ID No:  Chromosome  Position*
HW4TT    1           4           41982184:41989670
HW2TT    2           2           17383689:17391462
HW13TT   3           13          68693759:68699228
HWXTT    4           X           113026618:113027400
HW21TT   5           21          27148627:27156168
ERVWE1   6           7           91935221:91945670

*Position relative to Ensembl version No. 39 (June 2006) (NCBI No. 36) http://www.ensembl.org/Homo_sapiens/index.html

The HW13TT locus is a chimeric provirus of HERV-W/L type resulting from the recombination of an HERV-W provirus and an HERV-L provirus. This chimera is such that the 5′ region made up of the sequence starting from the beginning of the 5′ LTR to the end of the determined gag fragment is of W type and the 3′ region made up of the sequence starting from the subsequent pol fragment to the end of the 3′ LTR (U3-R only) is of L type. This results in a fusion of the 3′ gag W-5′ pol L regions.

A search for open reading frames (ORFs) of at least 150 bases, based on the identification of a start codon and of a stop codon, was carried out using the MacVector 9.5.2 software, and the corresponding polypeptides were identified.
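By way of illustration only, an ORF search of this kind can be sketched as follows in Python. This is not the algorithm of the software cited above, whose settings are not given here. The sketch assumes that only the forward strand is scanned, that ATG is the only start codon, and that the 150-base minimum is counted from the start codon through the stop codon inclusive; the function name is hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_ORF_LENGTH = 150  # bases, as stated in the text


def find_orfs(seq: str, min_length: int = MIN_ORF_LENGTH) -> List[Tuple[int, int, int]]:
    """Return (start, end, frame) for forward-strand ORFs.

    `start` is the 0-based index of the ATG, `end` is exclusive and
    includes the stop codon. Only the first ATG of each open stretch
    is used, so nested ORFs sharing a stop codon are not reported.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_length:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs
```

Each ORF found in this way can then be translated to obtain the corresponding polypeptide.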

The ORF 1 of HW4TT identified in SEQ ID No. 13 encodes a Gag protein identified in SEQ ID No. 14 and the ORF 2 of HW4TT (SEQ ID No. 15) encodes a protease (SEQ ID No. 16),

the ORF1 of HW2TT identified in SEQ ID No. 17 encodes a Gag protein identified in SEQ ID No. 18 and the ORF 2 of HW2TT (SEQ ID No. 19) encodes a protein identified in SEQ ID No. 20,

the ORF of HW13TT identified in SEQ ID No. 21 encodes a Gag protein identified in SEQ ID No. 22,

the ORF of HW21TT identified in SEQ ID No. 23 encodes a Gag protein identified in SEQ ID No. 24,

the ORF of ERVWE1 identified in SEQ ID No. 25 encodes an Env protein identified in SEQ ID No. 26.

Example 2 Validation of the Loci Overexpressed in the Tumoral Testicle and Determination of the Associated Induction Factor

Principle:

Five of the six loci identified as overexpressed in the tumoral testicle by means of the high-density HERV chip were validated by real-time RT-PCR on three pairs of testicular samples. The specificity of this overexpression is evaluated by analyzing samples originating from other tissues. To this end, specific amplification systems were developed and used for the loci identified, as described in Table 2 below.

TABLE 2

Locus      Sense primer (SEQ ID No:)     Antisense primer (SEQ ID No:)
G6PD gene  TGCAGATGCTGTGTCTGG (27)       CGTACTGGCCCAGGACC (28)
HW4TT      GGTTCGTGCTAATTGAGCTG (29)     ATGGTGGCAAGCTTCTTGTT (30)
HW2TT      TGAGCTTTCCCTCACTGTCC (31)     TGTTCGGCTTGATTAGGATG (32)
HW13TT     CATGGCCCAATATTCCATTC (33)     GGTCCTTGTTCACAGAACTCC (34)
HWXTT      CCGCTCCTGATTGGACTAAA (35)     CGTGGGTCAAGGAAGAGAAC (36)
HW21TT     ATGACCCGCAGCTTCTAACAG (37)    CTCCGCTCACAGAGCTCCTA (38)

The expression of these loci is standardized with respect to that of a suitable housekeeping gene: G6PD. This quantification of expression was carried out using an Mx3005P real-time RT-PCR machine, marketed by the company Stratagene.
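By way of illustration only, the standardization with respect to G6PD and the tumor/healthy comparison can be sketched as follows in Python. The calculation actually used is not stated in the present description; the sketch uses the widely known 2^-ΔΔCt approach, which assumes an amplification efficiency close to 100% for both amplicons. The function name and the Ct values in the usage lines are hypothetical.

```python
def relative_expression(ct_target_tumor: float, ct_ref_tumor: float,
                        ct_target_healthy: float, ct_ref_healthy: float) -> float:
    """Factor of increase of expression (tumor vs. healthy), 2^-ddCt method.

    Each target Ct is first normalized to the housekeeping gene (G6PD)
    measured in the same sample, then the tumor is compared to the
    healthy tissue. Assumes ~100% PCR efficiency (an assumption).
    """
    delta_tumor = ct_target_tumor - ct_ref_tumor
    delta_healthy = ct_target_healthy - ct_ref_healthy
    return 2.0 ** -(delta_tumor - delta_healthy)


# Hypothetical Ct values: the target appears 5 cycles earlier in the
# tumor than in the healthy tissue once normalized to G6PD.
factor = relative_expression(22.0, 20.0, 27.0, 20.0)
print(factor)  # 32.0
```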

Results:

The study of the three pairs of testicular samples indicates that the five loci identified, with the exception of HWXTT, the expression of which could not be quantified in the second testicular RNA pair, are overexpressed in the tumoral testicle compared with the healthy tissue (cf.: FIG. 4). The very marked nature of the overexpression, i.e. a low or even absent transcriptional expression in the healthy testicle and a high expression in the tumoral testicle, reveals the possibility of an epigenetic mode of regulation of transcription of these loci.

The analysis of pairs of samples originating from other tissues (colon, uterus, breast, ovary, lung and prostate) shows that the overexpression phenomenon is restricted to the tumoral testicle. Consequently, the expression of the identified loci assumes the nature of a marker specific for testicular cancer.

Example 3 Epigenetic Control of Transcription

Principle:

DNA methylation is an epigenetic modification which takes place, in eukaryotes, by the addition of a methyl group to the cytosines of 5′-CpG dinucleotides, and results in transcriptional repression when this modification occurs within the nucleotide sequence of a promoter. Apart from a few exceptions, human endogenous sequences of retroviral origin are restricted, owing to this methylation process, to a silent transcriptional state in the cells of the organism under physiological conditions.

In order to analyze the methylation status of the unique LTR or of the 5′ LTR of the six loci, the “bisulfite sequencing PCR” method was used. This method makes it possible, on the basis of sequencing a representative sample of the population, to identify the methylation state of each CG dinucleotide on each of the sequences within the tissue studied.

Since the methylation information is lost during the amplification steps, it is advisable to encode the methylation information directly within the nucleotide sequence by treating the genomic DNA with sodium bisulfite. The action of the bisulfite (sulfonation), followed by hydrolytic deamination and then alkaline desulfonation, in fact makes it possible to convert all the cytosines contained in the genomic DNA into uracil. The speed of deamination of sulfonated cytosines (C) is, however, much higher than that of the sulfonated 5-methyl-Cs. It is therefore possible, by limiting the reaction time to 16 hours, to convert strictly the non-methylated cytosines to uracil (U), while at the same time preserving the cytosines which have a methyl group. After the sodium bisulfite treatment, the sequence of interest is amplified from the genomic DNA derived from the tumoral testicular section and from that derived from the adjacent healthy testicular section, by polymerase chain reaction (PCR) in two stages. The first PCR enables a specific selection of the sequence of interest; the second, “nested”, PCR makes it possible to amplify this sequence.

Since the DNA sequence had been modified by the bisulfite, the design of the primers took into account the code change (C to U), and the primers were selected so as to hybridize to a region containing no CpG (their methylation state, and therefore their conversion state, being a priori unknown).
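By way of illustration only, the code change taken into account for primer design can be simulated in silico with the following Python sketch. It models the sense strand after bisulfite treatment and PCR, where uracil is read as T. It assumes that cytosines outside a CpG context are nonmethylated and therefore converted, and it treats all CpG cytosines alike according to a single flag; the function names are hypothetical.

```python
def bisulfite_convert(seq: str, methylated_cpg: bool = True) -> str:
    """Sequence of the sense strand after bisulfite treatment and PCR.

    Non-CpG cytosines are read as T. CpG cytosines are kept as C when
    `methylated_cpg` is True and read as T otherwise.
    """
    seq = seq.upper()
    out = []
    for i, base in enumerate(seq):
        if base == "C":
            in_cpg = i + 1 < len(seq) and seq[i + 1] == "G"
            out.append("C" if (in_cpg and methylated_cpg) else "T")
        else:
            out.append(base)
    return "".join(out)


def contains_cpg(region: str) -> bool:
    """True when a candidate primer-binding region contains a CpG."""
    return "CG" in region.upper()
```

A candidate primer-binding region for which `contains_cpg` returns True would be rejected, since its converted sequence depends on an unknown methylation state.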

The sequences of the primers used are described in Tables 3 to 8 below.

TABLE 3 HW4TT locus

Amplification  Sense primer 5′→3′ (SEQ ID No.:)   Antisense primer 5′→3′ (SEQ ID No.:)
First PCR      CCAACATCACTAACACAACCT (39)         GGGAGTTAGTAAGGGGTTTG (40)
Nested PCR     CAACCTATTAAACAAAACTAAATT (41)      AGATTTAATAGAGTGAAAATAGAGTTT (42)

TABLE 4 HW2TT locus

Amplification  Sense primer 5′→3′ (SEQ ID No.:)   Antisense primer 5′→3′ (SEQ ID No.:)
First PCR      TTATTAGTTTAGGGGATAGTTG (43)        ACACAATAAACAACCTACTAAAT (44)
Nested PCR     GAGGGTAAGTGGTGATAAA (45)           AACCTACTAAATCCAAAAAAA (46)

TABLE 5 HW13TT locus

Amplification  Sense primer 5′→3′ (SEQ ID No.:)   Antisense primer 5′→3′ (SEQ ID No.:)
First PCR      TAGGATTTTAGGTTTATTGTTA (47)        AAAAATAAAATATTAAACC (48)
Nested PCR     ATATGTGGGAGTGAGAGATA (49)          CAACAACAAACAATAATAATAA (50)

TABLE 6 HWXTT locus

Amplification  Sense primer 5′→3′ (SEQ ID No.:)   Antisense primer 5′→3′ (SEQ ID No.:)
First PCR      TTGAGTTTTTTTATTGATAGTG (51)        TCTAAATCCTATTTTCCTACT (52)
Nested PCR     GTTTTTTTATTGATAGTGAGAGAT (53)      TAACAAACCTTTAATCCAAT (54)

TABLE 7 HW21TT locus

Amplification  Sense primer 5′→3′ (SEQ ID No.:)   Antisense primer 5′→3′ (SEQ ID No.:)
First PCR      TTTAGTGAGGATGATGTAATAT (55)        CAACTTAATAAAAATAAACCCA (56)
Nested PCR     ATAATGTTTTAGTAAGTGTTGGAT (57)      ACAATTACAAACCTTTAACC (58)

TABLE 8 ERVWE1 locus

Amplification  Sense primer 5′→3′ (SEQ ID No.:)   Antisense primer 5′→3′ (SEQ ID No.:)
First PCR      AATTCATTCAACATCCATTC (59)          GGTTTAATATTATTTATTATTTTGGA (60)
Nested PCR     CTCTTACCTTCCTATACTCTCTAAA (61)     AGAGTGTAGTTGTAAGATTTAATAGAGT (62)

After extraction on a gel and purification, the amplicons are cloned into plasmids, and the latter are used to transform competent bacteria. About twelve plasmid DNA mini preparations are carried out using the transformed bacteria and the amplicons contained in the plasmids are sequenced. The sequences obtained are then analyzed (cf.: FIGS. 5 to 10).
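By way of illustration only, the analysis of a sequenced clone against the reference sequence can be sketched as follows in Python. The sketch assumes that the clone is read on the sense strand and is aligned to the reference without gaps (same length); these conditions, the function names and the one-letter codes are assumptions made for the example. At each reference CpG, a C in the clone indicates a methylated cytosine, a T a nonmethylated one, and any other base a mutated or indeterminate site.

```python
from typing import List


def cpg_positions(reference: str) -> List[int]:
    """0-based positions of the C of each CpG in the reference sequence."""
    ref = reference.upper()
    return [i for i in range(len(ref) - 1) if ref[i:i + 2] == "CG"]


def call_clone(reference: str, clone: str) -> List[str]:
    """Methylation call at each reference CpG for one bisulfite clone.

    'M' = methylated (C retained), 'U' = nonmethylated (C read as T),
    '?' = mutated or indeterminate (any other base).
    """
    if len(reference) != len(clone):
        raise ValueError("clone must be aligned to the reference without gaps")
    clone = clone.upper()
    calls = []
    for pos in cpg_positions(reference):
        base = clone[pos]
        if base == "C":
            calls.append("M")
        elif base == "T":
            calls.append("U")
        else:
            calls.append("?")
    return calls
```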

Results:

The analysis of the 5′ region of the transcripts of the loci identified was carried out by means of the 5′ RACE technique. It in particular made it possible to show that the transcription is started at the beginning of the R region of the proviral 5′ LTR. This reflects the existence of a promoter role for the U3 region of the proviral 5′ LTR.

1. Methylation state of the U3 sequences of the 5′ LTR of the HW4TT locus:

The U3 sequence of the 5′ LTR of the HW4TT locus of reference contains 5 CpG sites:

-   a) in the sample of healthy testicular tissue: out of 12 sequences analyzed, 9 are completely methylated. The other 3 each time exhibit 1 CpG nonmethylated out of the 5 contained in the U3 region. This therefore represents an overall methylation of the U3 region of the 5′ LTR of the HW4TT locus amounting to 95% in the healthy testicular sample;
-   b) in the sample of tumoral testicular tissue: out of 12 sequences analyzed, 5 (i.e. 41.66% of the sequences) are completely demethylated, 3 sequences have 4 CpGs out of 5 nonmethylated, 2 sequences have 2 CpGs out of 5 nonmethylated, 1 sequence has 1 CpG out of 5 nonmethylated, and 1 sequence remains completely methylated. This therefore represents an overall methylation of the U3 region of the 5′ LTR of the HW4TT locus amounting to 30% in the tumoral testicular sample.
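By way of illustration only, the overall methylation percentages given above for the HW4TT locus can be reproduced with the following Python sketch. The formula (methylated CpGs divided by the total number of CpGs over all the sequences analyzed) is inferred from the reported figures rather than stated explicitly in the present description, and the function name is hypothetical.

```python
from typing import List


def overall_methylation(unmethylated_per_clone: List[int], cpg_sites: int) -> float:
    """Percentage of methylated CpGs over all sequenced clones.

    `unmethylated_per_clone` holds, for each clone, the number of
    nonmethylated CpGs among the `cpg_sites` CpGs of the region.
    """
    if not unmethylated_per_clone:
        raise ValueError("at least one clone is required")
    if any(u < 0 or u > cpg_sites for u in unmethylated_per_clone):
        raise ValueError("count outside the range 0..cpg_sites")
    total = cpg_sites * len(unmethylated_per_clone)
    methylated = total - sum(unmethylated_per_clone)
    return 100.0 * methylated / total


# HW4TT, 5 CpG sites, 12 clones per sample (counts taken from the text).
healthy = [0] * 9 + [1] * 3
tumor = [5] * 5 + [4] * 3 + [2] * 2 + [1] * 1 + [0] * 1
print(overall_methylation(healthy, 5))  # 95.0
print(overall_methylation(tumor, 5))    # 30.0
```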

2. Methylation state of the U3 sequences of the 5′ LTR of the HW2TT locus:

The U3 sequence of the 5′ LTR of the HW2TT locus of reference contains 5 CpG sites:

-   a) in the sample of healthy testicular tissue: out of 12 sequences analyzed, 9 are completely methylated, 1 has its 2nd CpG nonmethylated, 1 has the CpG at position 4 nonmethylated, 1 has the CpGs at positions 4 and 5 nonmethylated, and 3 sequences have point mutations on one or two CpGs (one in position 3, one in position 5 and one in positions 4 and 5), very probably reflecting PCR artifacts. This therefore represents an overall methylation of the U3 region of the 5′ LTR of the HW2TT locus amounting to 92.9% in the healthy testicular sample;
-   b) in the sample of tumoral testicular tissue: out of 12 sequences analyzed, 6 are completely demethylated, 5 sequences have one or two methylated CpG(s) (1 at position 1, 1 other at position 5, 1 on positions 1 and 5, 2 at positions 4 and 5 and 1 at position 3). Finally, one sequence has 4 CpGs methylated out of 5 (positions 1, 2, 4 and 5). This corresponds to an overall methylation of the U3 region of the 5′ LTR of the HW2TT locus amounting to 20% in the tumoral testicular sample.

3. Methylation state of the U3 sequences of the 5′ LTR of the HW13TT locus:

The U3 sequence of the 5′ LTR of the HW13TT locus of reference contains 3 CpG sites:

-   a) in the sample of healthy testicular tissue: an additional CpG, compared with the reference sequence, is found in 4 of the 10 clones studied for this locus. It is located between CpGs 2 and 3 and is methylated. In the other 6 clones, this site is mutated compared with the reference sequence. The other 3 CpGs of the U3 region are methylated in the 10 sequences analyzed. This therefore represents an overall methylation of the U3 region of the 5′ LTR of the HW13TT locus amounting to 100% in the healthy testicular sample;
-   b) in the sample of tumoral testicular tissue: the additional CpG indicated above is also found. It is demethylated in 4 of the 10 sequences analyzed, mutated in 3 other sequences, and its methylation state is indeterminate in the last 3 sequences. 7 sequences out of 10 are completely demethylated and the other 3 are methylated on the 2nd and on the 3rd CpG. This corresponds to an overall methylation of the U3 region of the 5′ LTR of the HW13TT locus amounting to 20% in the tumoral testicular sample.

4. Methylation state of the U3 sequences of the solitary LTR of the HWXTT locus:

The U3 sequence of the solitary LTR of the HWXTT locus of reference contains 6 CpG sites:

-   a) in the sample of healthy testicular tissue: the 8 sequences analyzed are completely methylated, which corresponds to a methylation percentage of 100% in the healthy testicular sample;
-   b) in the sample of tumoral testicular tissue: the 9 sequences analyzed are completely demethylated, which corresponds to a methylation percentage of 0%.

5. Methylation state of the U3 sequences of the 5′ LTR of the HW21TT locus:

The U3 sequence of the 5′ LTR of the HW21TT locus of reference contains 7 CpG sites:

-   a) in the sample of healthy testicular tissue: the 10 sequences analyzed all have 6 CpGs methylated out of 7; for 6 of the sequences, the 1st CpG is nonmethylated and for the other 4 sequences, the 4th CpG is nonmethylated. This therefore represents an overall methylation of the U3 region of the 5′ LTR of the HW21TT locus amounting to 85.7% in the healthy testicular sample;
-   b) in the sample of tumoral testicular tissue: out of 8 sequences analyzed, 6 are completely demethylated, 2 others exhibit a profile identical to one of those found in the healthy testicular tissue, namely 6 CpGs methylated and the 1st CpG nonmethylated. This corresponds to an overall methylation of the U3 region of the 5′ LTR of the HW21TT locus amounting to 21.4% in the tumoral testicular sample.

6. Methylation state of the sequences of the activator of the U3 of the 5′ LTR of the ERVWE1 locus:

The ERVWE1 locus comprises, in addition to its U3 promoter region, a known activator located directly upstream of the 5′ LTR, and which contains two CpG sites (CpG 1 and 2). The U3 sequence of the 5′ LTR of the ERVWE1 locus of reference contains, for its part, 5 CpG sites (CpGs 3 to 7):

-   -   a) in the sample of healthy testicular tissue: out of 10 sequences analyzed, 5 sequences have CpGs 1 and 2 (activator) and 5 (U3) nonmethylated, 1 sequence has CpGs 2 and 5 nonmethylated, 2 sequences have CpGs 1 (activator) and 7 (U3) nonmethylated, 1 sequence has CpG 7 only nonmethylated and, finally, 1 is completely methylated for the 7 CpGs. In total, this corresponds to a methylation percentage of 68.57% in the healthy testicular sample;
    -   b) in the sample of tumoral testicular tissue: out of the 10 sequences analyzed, only 3 sequences exhibit, for each one, a unique methylated CpG (CpG 4 or CpG 5 or CpG 6), the other 7 sequences are completely demethylated, which corresponds to a methylation percentage of 4.29%.
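The overall methylation percentages quoted for the ERVWE1 locus can be checked by tallying methylated CpG observations over all CpG observations (sequences × sites). The following is a minimal sketch of that tally, not the analysis procedure actually used; the function name `methylation_percentage` and the data layout are hypothetical. For the tumoral sample, the text does not say which of CpG 4, 5 or 6 is methylated in which sequence, so one of each is assumed here; this choice does not affect the total.

```python
def methylation_percentage(unmethylated_per_sequence, n_sites):
    """Overall methylation (%) across all sequences and CpG sites.

    unmethylated_per_sequence: one collection per sequenced clone, listing
    the CpG positions found nonmethylated in that clone.
    n_sites: number of CpG sites considered per sequence.
    """
    total = len(unmethylated_per_sequence) * n_sites
    unmethylated = sum(len(set(sites)) for sites in unmethylated_per_sequence)
    return round(100 * (total - unmethylated) / total, 2)


ALL_SITES = set(range(1, 8))  # CpGs 1-2 (activator) and 3-7 (U3)

# Healthy tissue: nonmethylated CpGs per sequence, as described in a)
healthy = [{1, 2, 5}] * 5 + [{2, 5}] + [{1, 7}] * 2 + [{7}] + [set()]

# Tumoral tissue: 3 sequences with a single methylated CpG (assumed to be
# CpG 4, 5 and 6 respectively), 7 sequences completely demethylated
tumoral = [ALL_SITES - {4}, ALL_SITES - {5}, ALL_SITES - {6}] + [ALL_SITES] * 7

print(methylation_percentage(healthy, 7))  # 68.57
print(methylation_percentage(tumoral, 7))  # 4.29
```

Both results match the percentages stated above (48 methylated observations out of 70 in the healthy sample, 3 out of 70 in the tumoral sample).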

The very high level of methylation of the U3 retroviral promoters of the loci considered, in the healthy tissue, is correlated with the low, or even absent, transcriptional expression of the corresponding U5 regions, indicating repression of transcriptional expression by an epigenetic mechanism. Conversely, the low level of methylation of these same promoters in the tumoral tissue reflects a lifting of transcriptional inhibition, which results in the significantly higher expression demonstrated by means of the high-density HERV DNA chip and by real-time RT-PCR.

Literature references

[1] Nickerson D. A. et al., DNA sequence diversity in a 9.7-kb region of the human lipoprotein lipase gene, Nature Genetics, Vol. 19, pp 233-240 (1998).

[2] Cottrell S. E., Molecular diagnostic applications of DNA methylation technology, Clinical Biochemistry 37, pp 595-604 (2004). 

1. A method for in vitro diagnosis or prognosis of testicular cancer, in a biological sample from a patient suspected of suffering from testicular cancer, comprising a step of detecting presence or absence of at least one expression product from at least one nucleic acid sequence selected from the full-length sequences identified in SEQ ID NOS: 1 to 6 or from sequences which exhibit at least 99% identity with one of the full-length sequences identified in SEQ ID NOS: 1 to 6.
 2. The method as claimed in claim 1, wherein the expression product detected is at least one mRNA transcript or at least one polypeptide.
 3. The method as claimed in claim 2, wherein the mRNA transcript is detected by hybridization, by amplification, or by sequencing.
 4. The method as claimed in claim 1, wherein the expression product is at least one mRNA transcript, having a sequence selected from one of the full-length sequences identified in SEQ ID NOS: 7 to 12.
 5. The method as claimed in claim 1, wherein the mRNA is brought into contact with at least one probe and/or at least one primer under predetermined conditions which enable hybridization, and in that the presence or absence of hybridization to the mRNA is detected.
 6. The method as claimed in claim 1, wherein DNA copies of the mRNA are prepared and the DNA copies are brought into contact with at least one probe and/or at least one primer under predetermined conditions which enable hybridization, and in that the presence or absence of hybridization to said DNA copies is detected.
 7. The method as claimed in claim 2, wherein the polypeptide expressed is detected by bringing it into contact with at least one binding partner specific for said polypeptide.
 8. A molecular marker for in vitro diagnosis or prognosis of testicular cancer, the molecular marker comprising at least one isolated nucleic acid sequence selected from the group consisting of: (i) the full-length DNA sequences set forth in SEQ ID NOS: 1-6, (ii) the full-length complementary DNA sequences of the sequences defined in (i), (iii) DNA sequences that exhibit at least 99% identity with the sequences defined in (i) and/or (ii), (iv) RNA sequences that are the products of transcription of the sequences defined in (i), (v) RNA sequences that are the products of transcription of sequences that exhibit at least 99% identity with the sequences defined in (i), and (vi) the full-length RNA sequences set forth in SEQ ID NOS: 7-12.